Add UMD bundle for Kedro-Viz #2256

Closed
ravi-kumar-pilla wants to merge 13 commits

ravi-kumar-pilla
Contributor

ravi-kumar-pilla commented Jan 30, 2025

Description

Add a UMD bundle of Kedro-Viz that can be used directly in an HTML document.

Development notes

  • Created a umd folder in the current kedro-viz GitHub repository
  • Upload the production bundles kedro-viz.production.min.js and vendors.production.min.js to the umd folder
  • Automate the process of bundling and updating the bundles via make version, i.e., add a step to update this folder when we do a new release

QA notes

  • All tests should pass
  • Test locally that the umd bundle is produced by running either make version VERSION=10.3.0 or npm run build:umd. Discard all changes after testing

Checklist

  • Read the contributing guidelines
  • Opened this PR as a 'Draft Pull Request' if it is work-in-progress
  • Updated the documentation to reflect the code changes
  • Added new entries to the RELEASE.md file
  • Added tests to cover my changes

Signed-off-by: ravi_kumar_pilla
…at/umd-viz-bundle

@ravi-kumar-pilla
Contributor Author

Hi Team,

Here are some notes and updates on the kedro-viz bundle processing. I have a few questions, and we will need some discussion before we publish the bundle.

Summary:

We will have 2 bundles -

  1. For production: kedro-viz.production.min.js and vendors.production.min.js (1.51MB)
  2. For development: kedro-viz.development.min.js (2.66MB), kedro-viz.development.min.js.map (8.4MB) (traceback enabled)

For the notebook use case, our backend implementation will construct an HTML page that references the production bundle via CDN, as below:

<script src="https://www.unpkg.com/@quantumblack/kedro-viz@latest/lib/umd/vendors.production.min.js"></script>
<script src="https://www.unpkg.com/@quantumblack/kedro-viz@latest/lib/umd/kedro-viz.production.min.js"></script>
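
As a rough illustration of how a backend could assemble that page, here is a minimal Python sketch. The CDN base and the lib/umd path are copied from the two script tags above; the function names, the `version` parameter and the `root` mount element are hypothetical and not taken from the PR.

```python
from html import escape

# CDN layout copied from the two <script> tags quoted in this comment.
CDN_BASE = "https://www.unpkg.com/@quantumblack/kedro-viz"
# Order matters: the vendors bundle is listed before the app bundle, as in the comment.
BUNDLES = ("vendors.production.min.js", "kedro-viz.production.min.js")


def bundle_script_tags(version: str = "latest") -> str:
    """Return one <script> tag per production bundle, vendors first."""
    tags = []
    for name in BUNDLES:
        src = f"{CDN_BASE}@{version}/lib/umd/{name}"
        tags.append(f'<script src="{escape(src, quote=True)}"></script>')
    return "\n".join(tags)


def build_page(version: str = "latest") -> str:
    """Wrap the script tags in a minimal HTML document (hypothetical template)."""
    return (
        "<!DOCTYPE html>\n"
        "<html>\n<head>\n"
        f"{bundle_script_tags(version)}\n"
        "</head>\n"
        # The mount element id is an assumption made for this sketch.
        '<body><div id="root"></div></body>\n'
        "</html>"
    )
```

Passing an explicit version instead of `latest` would keep a notebook from silently picking up a newer bundle; whether that is desirable here is a design choice for the team.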

Question: If we add the umd bundle to the current npm package, the size of the package will increase, even though the npm package itself does not need the umd bundle. Including the dev bundle would make this much worse, since the dev bundle and its source map together are roughly 11MB.

  1. Should we publish the umd-related scripts to a different package (kedro-viz-umd?)
  2. Should we just upload them to our GitHub repo? (This seems to be the better option for now)

cc: @rashidakanchwala @astrojuanlu @jitu5

NOTE: The total umd folder size, including dev and prod, is around 13MB. If we do not need tracebacks to debug errors, we can leave out the dev bundle.
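
For reference, the arithmetic behind the "around 13MB" figure can be checked with a few lines of Python. The sizes are the ones quoted in the summary above; the dictionary keys are just labels chosen for this sketch.

```python
# Sizes in MB as quoted in the summary of this comment.
sizes_mb = {
    "production bundles (kedro-viz + vendors)": 1.51,
    "kedro-viz.development.min.js": 2.66,
    "kedro-viz.development.min.js.map": 8.4,
}

total_mb = sum(sizes_mb.values())
# Everything that would disappear if the dev bundle were left out.
dev_only_mb = total_mb - sizes_mb["production bundles (kedro-viz + vendors)"]

print(f"total umd folder: {total_mb:.2f} MB")
print(f"dev bundle + source map: {dev_only_mb:.2f} MB")
```

The total comes to about 12.6MB, of which roughly 11MB is the dev bundle and its source map.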

Exploration:

UMD bundle optimizations and next steps:

Analysis was done using webpack-bundle-analyzer

Started from a bundle with a parsed size of 12MB without any optimizations, and 5MB when minified
image

Vendor libraries in the bundle -

image

The major chunks are plotly, react-dom, lodash and @apollo/client

Some optimizations -
I have already made plotly an external dependency: at least for the notebook use case we are not planning to include the metadata panel, and by default all datasets are memory datasets

Below are the plotly references in the codebase:
image

  • React-DOM - This can be made external as well, in which case we need to import it via CDN. The issue with making it an external dependency is compatibility with KedroViz, as we currently test with specific React and ReactDOM versions. (We might get errors like - Cannot read properties of undefined (reading '__SECRET_INTERNALS_DO_NOT_USE_OR_YOU_WILL_BE_FIRED') :D)

  • Lodash is not used heavily in KedroViz, so we can look for lighter alternatives to reduce the bundle size

  • @apollo/client will be removed once experiment tracking (ET) is removed

    image

After basic optimizations (tree shaking, sideEffects, a minimizer, and dropping console calls and comments)

image

After making plotly an external dependency, on top of tree shaking, sideEffects, the minimizer, and dropping console calls and comments
image

Final prod bundle size (1.51MB) -


image

image

NOTE: Adding further optimizations on top of Terser, such as devtool: 'nosources-source-map' and compression-webpack-plugin (using the Brotli algorithm), did not impact the bundle size

For your information:

  • Terser: Reduces the raw code size by removing unnecessary content.
  • Brotli: Shrinks the payload size by compressing the minified files even further for network transmission. (In our case this did not yield any better results)
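
The difference between the two steps can be illustrated with a small Python sketch. It uses gzip only as a stand-in for Brotli, because gzip ships with the Python standard library; the sample payload is made up. One possible reason the compression step did not show up in the numbers above is that transfer compression leaves the minified file itself unchanged, so a parsed-size measurement would not move.

```python
import gzip

# Stand-in for a minified bundle: repetitive JavaScript-like text (made-up sample).
minified = b"function a(b){return b.map(function(c){return c*2})};" * 2000

# gzip is a stand-in for Brotli here; Brotli would need a third-party package.
compressed = gzip.compress(minified)

print(f"minified file, as parsed by the browser: {len(minified)} bytes")
print(f"compressed payload, as sent on the wire: {len(compressed)} bytes")

# Decompression restores exactly the same bytes, so the parsed size is unchanged.
assert gzip.decompress(compressed) == minified
```

In other words, the minifier changes what the browser has to parse, while compression only changes what travels over the network, and only when the server or CDN actually serves the compressed variant.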

Dev bundle (2.66MB) with the source map kedro-viz.development.min.js.map (8.4MB) for tracebacks - We need a dev bundle for debugging

image

DevBundle traceback in devTools
image

ProdBundle traceback in devTools (without the source map, the error cannot be traced back to shareable-url-metadata.js)
image

Some known errors -

  1. Since KedroViz interacts with the browser window, localStorage and the browser history, these errors occur when we run the app in an iframe (need to confirm with Jitendra whether there is a way to tackle this in the app). The error appears only on interactions within the viz window, such as clicking on nodes; the graph itself looks fine.

image

  2. Sometimes the graph is distorted on the initial draw and fits fine on re-run. This is random and happens when viz is in multiple cells, run using Run All Cells, in an iframe.

    image

  3. If we do not use an iframe, viz pollutes the browser history/URLs, and running viz in multiple cells can cause errors.

image

  1. Try to defer the /deploy-viz-metadata API call, i.e., make this call only when needed rather than on initial load.

    image

  2. Add a prop in viz to disable slice feature
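
The iframe approach mentioned in the known errors above could look roughly like the following Python sketch. It is not the implementation in this PR: the function name and the default dimensions are hypothetical. It only shows the general technique of embedding a complete HTML page through the iframe `srcdoc` attribute.

```python
from html import escape


def wrap_in_iframe(page_html: str, width: str = "100%", height: str = "600px") -> str:
    """Embed a full HTML page in an iframe using the srcdoc attribute."""
    # srcdoc carries the page as an attribute value, so quotes, ampersands
    # and angle brackets have to be escaped first.
    srcdoc = escape(page_html, quote=True)
    return (
        f'<iframe srcdoc="{srcdoc}" '
        f'style="width: {width}; height: {height}; border: none;"></iframe>'
    )
```

In a notebook the result could be rendered with `IPython.display.HTML(wrap_in_iframe(page))`. Each cell then gets its own document, which keeps the visualisation away from the notebook page's URL, but as noted above the app's own use of localStorage and history inside the iframe would still need to be handled.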

Testing:

Testing demo_project in notebook -
You can try the notebook available in PR - https://github.com/kedro-org/kedro-viz/pull/2241
https://github.com/kedro-org/kedro-viz/blob/feat/viz-pipe/demo-project/viz_jupyter_test.ipynb

Snippet:

from src.demo_project.pipeline_registry import register_pipelines

# NotebookVisualizer must also be imported; it comes from the branch in PR #2241 linked above.
demo_pipe = register_pipelines()

NotebookVisualizer().show(
    pipeline=demo_pipe,
    options={
        "display": {
            "expandPipelinesBtn": False,
            "exportBtn": False,
            "globalNavigation": False,
            "labelBtn": False,
            "layerBtn": False,
            "metadataPanel": True,
            "miniMap": False,
            "sidebar": False,
            "zoomToolbar": False,
        },
        "expandAllPipelines": False,
        "behaviour": {
            "reFocus": False,
        },
        "theme": "dark",
    },
)

Next steps: After the initial bundle release

  • Further reducing bundle size (Next 10 medium size libraries. Need to confirm with the team before making them external deps) -
  1. Lodash (Try to find any alternative if possible)
  2. @mui
  3. @apollo/client, graphql (This will be gone with ET)
  4. D3, d3-*
  5. @wry
  6. React-custom-scrollbars-2
  7. Stylis
  8. Optimism
  9. Cross-fetch
  10. Kiwi
  • Testing on databricks
  • Additional testing with other use cases like custom catalog, datasets etc.
  • Switch to an ESM build (there were some issues when importing the bundle as a module; this needs more time to investigate)

Thank you

@astrojuanlu
Member

This is amazing work 🔥 thank you @ravi-kumar-pilla!

What you describe makes me think that maybe we could try to separate the flowchart rendering part from the rest of the Viz frontend in a separate library, that would then be used by Viz. If I'm picturing this correctly, the flowchart rendering part wouldn't need to access local storage, browser history etc, so on the notebook we could use just that without fundamentally changing the architecture of the Kedro Viz web application. How does this sound?

Otherwise, the efforts to reduce the bundle size are promising but the other issues remain (weird distortion, inability to run on an iframe, url mangling)

@rashidakanchwala
Contributor

rashidakanchwala commented Feb 3, 2025

Hi @ravi-kumar-pilla, @astrojuanlu

This is amazing spike work @ravi-kumar-pilla —it has given us a lot to think about. I really appreciate the effort you’ve put into it

We need to evaluate maintainability, effort vs. impact, and alignment with future Kedro-Viz developments. Given that the second half of 2025 will focus on the Pipeline Editor, which will be a major architectural shift, I think we should take a step back and plan PS sessions to align on the best direction for Kedro-Viz.

Next steps: can we do a PS session covering the following?

  • Original problem - Kedro-Viz in Notebooks
    • Reviewing the user feedback we’ve received so far on the high fidelity prototypes
    • Understand what we can do to release an MVP soon
  • UMD Bundling
    • What additional benefits would UMD bundling bring beyond notebook integration? What are the cons?

and another session on:

  • Broader Kedro-Viz Architecture

@astrojuanlu
Member

Good points @rashidakanchwala , let's continue the conversation in #1993

ravi-kumar-pilla changed the title from "Feat/umd viz bundle" to "Add UMD bundle for Kedro-Viz" on Feb 5, 2025
ravi-kumar-pilla marked this pull request as ready for review on February 5, 2025 01:46
@astrojuanlu
Member

astrojuanlu commented Feb 5, 2025

1 more thing @ravi-kumar-pilla , could you add a bit more detail around why UMD and not ESM? Just for future reference.

My very basic understanding is that "ESM is the best module format thanks to its simple syntax, async nature, and tree-shakeability", whereas "UMD works everywhere and usually used as a fallback in case ESM does not work".

https://dev.to/iggredible/what-the-heck-are-cjs-amd-umd-and-esm-ikm

@ravi-kumar-pilla
Contributor Author

Hi @astrojuanlu , Yes the basic difference is what you have mentioned here.

I tried using the ESM bundle but could not get the app running, as I mentioned in my comment above. The app needs some more changes for the bundle to work. I can have a look at it now.

For the initial release, I thought we could go ahead with the UMD bundle since it is working fine and is widely compatible with older environments. I have added the tree-shaking optimization to the UMD bundle, which makes it more efficient (ESM provides this by default), so with these optimizations our bundle is closer to ESM. I will experiment today and provide more details on the ESM bundle.

Thank you

@ravi-kumar-pilla
Contributor Author

ravi-kumar-pilla commented Feb 6, 2025

Hi @astrojuanlu ,

I tested with the ESM module and made some changes. It is working now, but we are unable to split the JS into chunks (i.e., vendors.js and kedro-viz.js). This is not a big issue. As you might already know, ESM is fairly new but is considered the new standard for bundling. Even React 19 dropped UMD and adopted ESM workflows. So we can go ahead with ESM for the initial release, and if users face any compatibility issues we can fall back to UMD.

Also, I made React, React-DOM, lodash and plotly externals. The pros are that this reduces the bundle size (to 1.3MB) and makes the kedro-viz bundle independent of the React version. The con is that, since we do not yet support React 19, users who use the ESM bundle directly might face issues. For the notebook use case this is not an issue, as we build the template in the backend and make sure the versions match.
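
To make the "versions match" point concrete, here is a hedged Python sketch of how a backend template could pin the external dependencies with an import map. Everything specific in it is an assumption: the version numbers are placeholders, the CDN host is just one option, and whether the ESM bundle resolves its externals through bare specifiers such as `react` depends on how it is built.

```python
import json

# Placeholder versions for illustration only; the real values would be the
# ones Kedro-Viz is tested against.
PINNED = {"react": "18.2.0", "react-dom": "18.2.0"}


def import_map_tag(pinned: dict[str, str], cdn: str = "https://esm.sh") -> str:
    """Render a <script type="importmap"> tag mapping bare specifiers to pinned CDN URLs."""
    imports = {name: f"{cdn}/{name}@{version}" for name, version in pinned.items()}
    return (
        '<script type="importmap">'
        + json.dumps({"imports": imports}, indent=2)
        + "</script>"
    )
```

Because the map is generated next to the template, the backend stays in control of which React version the bundle sees, which is the guarantee described above for the notebook use case. Users loading the bundle directly would have to provide an equivalent mapping themselves.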

@jitu5 do you think we should bundle the external dependencies together so it would be easy for anyone to directly use the bundle ( if we include the above dependencies it would be 1.5MB, plotly can still be external ) ?

Thank you

@ravi-kumar-pilla
Contributor Author

closing this in favor of #2268
